
204 research outputs found

Inference of Markovian Properties of Molecular Sequences from NGS Data and Applications to Comparative Genomics

Author: Cannon Charles H.
Deng Minghua
Reinert Gesine
Ren Jie
Song Kai
Sun Fengzhu
Publication date: 03/04/2015

Next Generation Sequencing (NGS) technologies generate large amounts of short read data for many different organisms. The fact that NGS reads are generally short makes it challenging to assemble the reads and reconstruct the original genome sequence. For clustering genomes using such NGS data, word-count based alignment-free sequence comparison is a promising approach, but for this approach, the underlying expected word counts are essential. A plausible model for this underlying distribution of word counts is given through modelling the DNA sequence as a Markov chain (MC). For single long sequences, efficient statistics are available to estimate the order of MCs and the transition probability matrix for the sequences. As NGS data do not provide a single long sequence, inference methods on Markovian properties of sequences based on single long sequences cannot be directly used for NGS short read data. Here we derive a normal approximation for such word counts. We also show that the traditional Chi-square statistic has an approximate gamma distribution, using the Lander-Waterman model for physical mapping. We propose several methods to estimate the order of the MC based on NGS reads and evaluate them using simulations. We illustrate the applications of our results by clustering genomic sequences of several vertebrate and tree species based on NGS reads using alignment-free sequence dissimilarity measures. We find that the estimated order of the MC has a considerable effect on the clustering results, and that the clustering results that use an MC of the estimated order give a plausible clustering of the species. Comment: accepted by RECOMB-SEQ 201

arXiv.org e-Print Archive

CiteSeerX

Simulating Alpine Tundra Vegetation Dynamics in Response to Global Warming in China

Author: Anderson Zhang
Jeffery Welker
Minghua Song
Publication venue: IntechOpen
Publication date: 27/09/2010

IntechOpen

Comparison of metagenomic samples using sequence signatures

Author: Bai Jiang
Fengzhu Sun
Jie Ren
Kai Song
Minghua Deng
Xuegong Zhang
Publication venue: Springer Nature
Publication date: 27/12/2012

BACKGROUND: Sequence signatures, as defined by the frequencies of k-tuples (or k-mers, k-grams), have been used extensively to compare genomic sequences of individual organisms, to identify cis-regulatory modules, and to study the evolution of regulatory sequences. Recently many next-generation sequencing (NGS) read data sets of metagenomic samples from a variety of different environments have been generated. The assembly of these reads can be difficult, and analysis methods based on mapping reads to genes or pathways are also restricted by the availability and completeness of existing databases. Sequence-signature-based methods, however, do not need complete genomes or existing databases and thus can potentially be very useful for the comparison of metagenomic samples using NGS read data. Still, the applications of sequence signature methods for the comparison of metagenomic samples have not been well studied. RESULTS: We studied several dissimilarity measures, including d_2, d_2^* and d_2^S recently developed by our group, a measure (hereinafter denoted Hao) used in CVTree developed by Hao’s group (Qi et al., 2004), measures based on relative di-, tri-, and tetra-nucleotide frequencies as in Willner et al. (2009), as well as standard l_p measures between the frequency vectors, for the comparison of metagenomic samples using sequence signatures. We compared their performance using a series of extensive simulations and three real NGS metagenomic datasets: 39 fecal samples from 33 mammalian host species, 56 marine samples across the world, and 13 fecal samples from human individuals. Results showed that the dissimilarity measure d_2^S can achieve superior performance when comparing metagenomic samples by clustering them into different groups as well as recovering environmental gradients affecting microbial samples. New insights into the environmental factors affecting microbial compositions in metagenomic samples are obtained through the analyses. Our results show that sequence signatures of the mammalian gut are closely associated with diet and gut physiology of the mammals, and that sequence signatures of marine communities are closely related to location and temperature. CONCLUSIONS: Sequence signatures can successfully reveal major group and gradient relationships among metagenomic samples from NGS reads without alignment to reference databases. The d_2^S dissimilarity measure is a good choice in all application scenarios. The optimal choice of tuple size depends on sequencing depth, but it is quite robust within a range of choices for moderate sequencing depths.

Springer - Publisher Connector

PubMed Central

Responses of soil nitrogen mineralization to temperature and moisture in alpine ecosystems on the Tibetan Plateau

Author: Gao Qiong
Ouyang Hua
Song Minghua
Tian Yuqiang
Xu Xia
Xu Xingliang
Publication venue: Elsevier B.V.
Publication date: 31/12/2010

The responses of soil net nitrogen (N) mineralization to temperature and moisture were investigated in four alpine ecosystems (forest, shrub, meadow and steppe) on the Tibetan Plateau, using a laboratory incubation method with undisturbed soil cores. The results indicated that soil net N mineralization varies greatly between alpine ecosystems. As temperature rose from 5°C to 35°C, the soil net N mineralization rate of the forest ecosystem rose markedly at all three incubation moisture levels and that of the meadow ecosystem rose gently, while that of the shrub and steppe ecosystems increased from 5°C to 25°C and declined from 25°C to 35°C. At the same incubation temperature, soil net N mineralization in the four alpine ecosystems increased at intermediate moisture and decreased at low or high moisture.

Elsevier - Publisher Connector

The impact of atmospheric N deposition and N fertilizer type on soil nitric oxide and nitrous oxide fluxes from agricultural and forest Eutric Regosols

Author: Cowan Nicholas
Drewer Julia
Levy Peter
Skiba Ute
Song Ling
Zhou Minghua
Zhu Bo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2020

Agricultural and forest soils with low organic C content and high alkalinity were studied over 17 days to investigate the potential response of the atmospheric pollutant nitric oxide (NO) and the greenhouse gas nitrous oxide (N₂O) to (1) increased N deposition rates to forest soil; (2) different fertilizer types applied to agricultural soil; and (3) a simulated rain event on forest and agricultural soils. Cumulative forest soil NO emissions (148–350 ng NO-N g⁻¹) were ~ 4 times larger than N₂O emissions (37–69 ng N₂O-N g⁻¹). In contrast, agricultural soil NO emissions (21–376 ng NO-N g⁻¹) were ~ 16 times smaller than N₂O emissions (45–8491 ng N₂O-N g⁻¹). Increasing N deposition rates 10-fold to 30 kg N ha⁻¹ yr⁻¹ doubled soil NO emissions and NO₃⁻ concentrations. As such high N deposition rates are not atypical in China, more attention should be paid to forest soil NO research. Comparing the fertilizers urea, ammonium nitrate, and urea coated with the urease inhibitor ‘Agrotain®’ demonstrated that the inhibitor significantly reduced NO and N₂O emissions. This is an unintended, not well-known benefit, because the primary function of Agrotain® is to reduce emissions of the atmospheric pollutant ammonia. Simulating a climate change event, a large rainfall after drought, increased soil NO and N₂O emissions from both agricultural and forest soils. Such pulses of emissions can contribute significantly to annual NO and N₂O emissions, but currently do not receive adequate attention amongst the measurement and modeling communities.

NERC Open Research Archive


A comprehensive analysis and source apportionment of metals in riverine sediments of a rural-urban watershed.

Author: Dahlgren Randy A
Ji Xiaoliang
Mei Kun
Qu Liyin
Song Qiujin
Xia Fang
Zhang Chi
Zhang Minghua
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Quantitative assessment of metal sources in sediments is essential for implementation of source control and remediation strategies. This study investigated metal contamination in sediments to assess potential ecological risks and quantify pollutant sources of metals (Cu, Zn, Pb, Cd, Cr, Co and Ni) in the Wen-Rui Tang River watershed. Total and fraction analysis indicated high pollution levels of metals. Zinc and Cd posed high ecological risk based on the risk assessment code, with the highest ecological risk found in the southwest of the watershed. The positive matrix factorization (PMF) model was highly effective in predicting total metal concentrations and identified three contributing metal sources. An agricultural source (factor 1) contributed highly to Cu (74.1%) and Zn (42.5%), and was most prominent in the west and south-central portions of the watershed. Cd (93.5%) showed a high weighting with industrial sources (factor 2), with a hot spot in the southwest. Factor 3 was identified as a mixed natural and vehicle traffic source that showed a large contribution to Cr (65.2%), Ni (63.9%) and Pb (50.7%). Spatial analysis indicated a consistent pattern between PMF-identified factors and suspected metal sources at the watershed scale, demonstrating the efficacy of the PMF modeling approach for watershed analysis.

eScholarship - University of California

Liao ning virus in China

Author: Dong Qiang
Fu Shihong
Li Minghua
Li Wenjuan
Liang Guodong
Liu Hong
Lu Xinjun
Lu Zhi
Tang Qing
Tong Suxiang
Zhang Song
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Liao ning virus (LNV) is in the genus Seadornavirus within the family Reoviridae and has a genome composed of 12 segments of double-stranded RNA (dsRNA). It is transmitted by mosquitoes, has been isolated only in China to date, and is the only species within the genus Seadornavirus reported to have been propagated in mammalian cell lines. In this study, we report 41 new isolates from the northern and southern Xinjiang Uygur autonomous region in China and describe the phylogenetic relationships among all 46 Chinese LNV isolates. Findings: The phylogenetic analysis, based on the partial nucleotide sequence of the 10th segment, which is predicted to encode outer coat proteins of LNV, indicated that all the isolates evaluated in this study can be divided into 3 different groups that appear to be related to geographic origin. Bayesian coalescent analysis estimated the date of the most recent common ancestor for the current Chinese LNV isolates to be 318 (with a 95% confidence interval of 30-719), and the estimated evolutionary rate is 1.993 × 10⁻³ substitutions per site per year. Conclusions: The results indicated that LNV may be an emerging virus that is evolving rapidly and has been widely distributed in the northern part of China.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Mesoporous bioactive glass surface modified poly(lactic-co-glycolic acid) electrospun fibrous scaffold for bone regeneration

Author: Amanda Vaughn
Chaoyue Zhang
Dajiang Song
Jinsong Li
Linsheng Huang
Minghua Hu
Ruisen Zhan
Shaohua Liu
Shijie Chen
Song Wu
Wei Xu
Zhiyuan Jian
Zongmiao Wan
Publication venue: Dove Medical Press Ltd.

Crossref